Geographically-aware Cross-media Retrieval for Associating Photos to Travelogues
Abstract
Textual documents published on the Web in which people describe their traveling experiences, usually called travelogues, can provide interesting information about what the respective authors experienced while traveling. Several websites now support sharing these textual documents, and the use of Web information for travel planning has also increased. Still, travelogues by themselves are of limited use. Intuitively, reading a travelogue while viewing related photos, such as typical scenery and points of interest, could be more useful than reading the text alone, particularly in contexts such as choosing travel destinations. The automatic association of illustrative photos to textual documents such as travelogues can be seen as a cross-media retrieval problem, where the objective is to retrieve the best photos for queries based on the text. These retrieval problems are not trivial, not only because of the different types of media involved (i.e., images and textual documents) but also because of the semantic gap between the queries and the resources. In this work, we proposed and evaluated different methods to associate georeferenced photos to textual documents such as travelogues. Specifically, we experimented with automatic methods for collecting, selecting and ranking photos, based on their similarity to the text. Photos are collected using the API from Flickr, a popular photo-sharing service where photos are often georeferenced (i.e., they are associated with geospatial coordinates for the places in which they were taken). The selection and ranking of photos are based on a set of features that capture multiple notions of relevance between the collected photos and the textual document (e.g., textual similarity, geographical proximity, temporal cohesion, sentimental polarity and visual clustering).
The geographical proximity features are based on the distance between the locations recognized in the textual document and the places where the photos were taken. To recognize location names in the textual document, we used the Yahoo! Placemaker service. The features considered for estimating relevance were combined through two different approaches, namely by using supervised learning-to-rank methods (e.g., the Coordinate Ascent algorithm) or by using unsupervised rank aggregation methods (e.g., the CombMNZ approach).
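As a rough illustration of the two ideas named above, the following Python sketch computes a great-circle (haversine) distance that a geographical proximity feature could build on, and fuses per-feature photo scores with CombMNZ. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the min-max normalization, the use of the haversine formula, and the example scores are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points.

    Assumption: the abstract does not say which distance measure is used;
    haversine is one common choice for georeferenced photos.
    """
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def min_max(scores):
    """Rescale one feature's scores to [0, 1] (assumed normalization)."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {photo: 1.0 for photo in scores}
    return {photo: (s - lo) / (hi - lo) for photo, s in scores.items()}


def comb_mnz(feature_scores):
    """Fuse several {photo_id: score} dicts with CombMNZ.

    Each photo's fused score is the sum of its normalized scores multiplied
    by the number of feature lists in which the photo appears.
    Returns photo ids ordered from best to worst.
    """
    total = defaultdict(float)
    hits = defaultdict(int)
    for scores in feature_scores:
        if not scores:
            continue  # skip features that scored no photos
        for photo, s in min_max(scores).items():
            total[photo] += s
            hits[photo] += 1
    fused = {photo: total[photo] * hits[photo] for photo in total}
    return sorted(fused, key=fused.get, reverse=True)


# Hypothetical scores for three candidate photos under two features.
textual = {"a": 0.9, "b": 0.5, "c": 0.1}
geographic = {"a": 0.2, "b": 0.8}
ranking = comb_mnz([textual, geographic])
```

In this sketch, a photo that scores well on several features is rewarded twice, once through the summed scores and once through the multiplier, which is the property that distinguishes CombMNZ from a plain score sum.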
Similar resources
Associating Relevant Photos to Georeferenced Textual Documents through Rank Aggregation
The automatic association of illustrative photos to paragraphs of text is a challenging cross-media retrieval problem with many practical applications. In this paper we propose novel methods to associate photos to textual documents. The proposed methods are based on the recognition and disambiguation of location names in the texts, using them to query Flickr for candidate photos. The best photo...
Cross-Language and Cross-Media Image Retrieval: An Empirical Study at ImageCLEF2007
This paper summarizes our empirical study of cross-language and cross-media image retrieval at the CLEF image retrieval track (ImageCLEF2007). This year, we participated in the ImageCLEF photo retrieval task, in which the goal is to search for natural photos using queries that contain both textual and visual information. In this paper, we study the empirical evaluations of our solu...
Developing a Recommendation Framework for Tourist by Mining Geo-tag Photos (Case Study Tehran District 6)
With the increasing popularity of sharing media on social networks and easier access to location technologies, such as the Global Positioning System (GPS), people are more interested in sharing their own photos and videos. World Wide Web users are no longer only consumers of information but also producers of it, hence a wealth of information is available on Web 2.0 applications. The ...
Semantic Understanding and Commonsense Reasoning in an Adaptive Photo Agent
In a story telling authoring task, an author often wants to set up meaningful connections between different media, such as between a text and photographs. To facilitate this task, it is helpful to have a software agent dynamically adapt the presentation of a media database to the user's authoring activities, and look for opportunities for annotation and retrieval. Expecting the user to manually...
Semantic Understanding and Commonsense Reasoning
In a story telling authoring task, an author often wants to set up meaningful connections between different media, such as between a text and photographs. To facilitate this task, it is helpful to have a software agent dynamically adapt the presentation of a media database to the user's authoring activities, and look for opportunities for annotation and retrieval. Expecting the user to manually...